Wikipedia Based Approach for Clustering Keyword of Reviews

نویسندگان

  • Ming Jiang
  • Wencao Yan
  • Xingqi Wang
  • Jingfan Tang
  • Chunming Wu
چکیده

A novel method based on Wikipedia for clustering keyword of reviews is proposed. Users can quickly finding the themes they interest through it. First the method extracts keywords, then calculates word similarity based on Wikipedia to generate similarity matrix, finally uses k-means to cluster. The performance is better than the methods which based on How-net and Word-net. The accuracy is around 77%.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...

متن کامل

Spectral Clustering Wikipedia Keyword-Based Search Results

The paper summarizes our research in the area of unsupervised categorization of Wikipedia articles. As a practical result of our research, we present an application of spectral clustering algorithm used for grouping Wikipedia search results. The main contribution of the paper is a representation method for Wikipedia articles that has been based on combination of words and links and used for cat...

متن کامل

Self-Organizing Map Representation for Clustering Wikipedia Search Results

The article presents an approach to automated organization of textual data. The experiments have been performed on selected sub-set of Wikipedia. The Vector Space Model representation based on terms has been used to build groups of similar articles extracted from Kohonen Self-Organizing Maps with DBSCAN clustering. To warrant efficiency of the data processing, we performed linear dimensionality...

متن کامل

Keyword Extraction using Multiple Novel Features

In this paper, we propose a novel approach for keyword extraction. Different from previous keyword extraction methods, which identify keywords based on the document alone, this approach introduces Wikipedia knowledge and document genre to extract keywords from the document. Keyword extraction is accomplished by a classification model utilizing not only traditional word based features but also f...

متن کامل

Automatic Content-Based Categorization of Wikipedia Articles

Wikipedia’s article contents and its category hierarchy are widely used to produce semantic resources which improve performance on tasks like text classification and keyword extraction. The reverse – using text classification methods for predicting the categories of Wikipedia articles – has attracted less attention so far. We propose to “return the favor” and use text classifiers to improve Wik...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • JSW

دوره 9  شماره 

صفحات  -

تاریخ انتشار 2014